We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and "fake news" detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows to profile entire media outlets in advance. Unlike previous work, which has focused primarily on text (e.g.,~on the text of the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here our main focus is on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e.,~the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-of-the-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, prediction accuracy is found to improve by 2.5-27 macro-F1 points for the two tasks.
translated by 谷歌翻译
在本文中,我们提出了一种新方法来检测具有归因顶点的无向图中的簇。目的是将不仅在结构连接性方面,而且在属性值方面相似的顶点分组。我们通过创建[6,38]中提出的其他顶点和边缘,将顶点之间的结构和属性相似。然后将增强图嵌入到与其拉普拉斯式相关的欧几里得空间中,在该空间中,应用了修改的K-均值算法以识别簇。修改后的k均值依赖于矢量距离度量,根据每个原始顶点,我们分配了合适的矢量值坐标集,这取决于结构连接性和属性相似性,因此每个原始图顶点都被认为是$ M+1的代表增强图的$顶点,如果$ m $是顶点属性的数量。为了定义坐标矢量,我们基于自适应AMG(代数多机)方法采用了我们最近提出的算法,该方法识别了嵌入欧几里得空间中的坐标方向,以代数平滑的矢量相对于我们的增强图Laplacian,从而扩展了laplacian,从而扩展了坐标。没有属性的图形的先前结果。我们通过与一些知名方法进行比较,分析了我们提出的聚类方法的有效性,这些方法可以免费获得软件实现,并与文献中报告的结果相比,在两种不同类型的广泛使用的合成图上以及在某些现实世界中的图形上。
translated by 谷歌翻译